Abstract. Most communication networks are complex. In this paper, we address one of the fundamental 
problems we are facing nowadays, namely, how we can efficiently protect these networks. To this end, 
we study an immunization strategy and find that it works almost as well as targeted immunization, 
while using only local information about the network topology. Our findings are supported by numerical 
simulations of the Susceptible-Infected-Removed (SIR) model on top of real communication networks, 
where immune nodes are previously identified by a covering algorithm. The results provide useful hints 
on how to design and deploy a digital immune system. 

1 Introduction 

Communication networks have been intensively studied
during the last several years, as it turned out that their
topology is far from being random [1,2,3,4]. In particular,
it has been found that physical networks (the Internet)
as well as logical networks (the World Wide Web) and
peer-to-peer networks (Gnutella) are characterized by a
power-law degree distribution [?], $P(k) \sim k^{-\gamma}$
(thus, they are referred to as scale-free networks [5,6]),
where the degree or connectivity k of a node is the number
of nodes it is attached to. These findings, together with
similar network structures found in fields as diverse as
biological, social and natural systems [7,8,9], have led to
a burst of activity aimed at characterizing the structure
and dynamics of complex networks.

The spreading of an epidemic disease in complex networks
was among the first relevant problems addressed in the
literature [10,11,12]. Surprisingly, it was found that for
infinite scale-free networks with $2 < \gamma < 3$, the
epidemic always pervades the system no matter what the
spreading rate is [11,12,13,14], even when correlations
are taken into account [15,16,17]. In other words, the usual
threshold picture no longer applies. This fact would be a
mere anecdote were it not that most vaccination and public
health campaigns are based on the existence of such a
threshold [18]. In practice, it would be desirable to have
a threshold as large as possible for a given epidemic
disease.
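
The absence of a threshold can be made explicit with the standard heterogeneous mean-field estimate for uncorrelated networks. The expression below is not written out in the text; it is quoted as a textbook result under the assumption of uncorrelated degrees, up to model-dependent corrections in the denominator, and the degree cutoff $k_{\max}$ is introduced here only for illustration:

```latex
\lambda_c \simeq \frac{\langle k \rangle}{\langle k^2 \rangle},
\qquad
\langle k^2 \rangle \sim \int^{k_{\max}} k^{2-\gamma}\, dk \sim k_{\max}^{3-\gamma}
\quad (2 < \gamma < 3).
```

Since $\langle k \rangle$ stays finite for $\gamma > 2$ while $\langle k^2 \rangle$ grows without bound as $k_{\max} \to \infty$, the estimate gives $\lambda_c \to 0$ in the infinite-size limit.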

Soon after the first studies on epidemic spreading, it
was realized that traditional vaccination strategies based
on random immunization, while worthwhile for random
network topologies, were useless in scale-free networks [19].
Specifically, it was shown that a minimum fraction as large
as 80% of the nodes has to be immunized in order to recover
the epidemic threshold. New vaccination strategies are thus
needed in order to deal efficiently with the actual topology
of real-world networks. A very efficient approach consists
of vaccinating the highly connected nodes in order to cut
the paths through which most of the susceptible nodes catch
the epidemic [19,20]. However, in order to do that, one has
to identify the core groups or hubs of the system. In
general, this is extremely unrealistic, particularly for
large networks and for systems lacking central
organizational rules, such as social networks.

In this paper, we consider the immunization problem 
from a different perspective. We show that it can be treated 
as a covering problem, in which a set of immune agents 
has to be placed somewhere in the network. The main 
advantage of this approach is that only local topological 
knowledge is needed up to a given distance d, so that it can 
be straightforwardly applied to a real situation. To verify 
the results of the immunization strategy, we implement 
the Susceptible-Infected-Removed epidemiological model 
[13,14] on top of the Internet maps at the Autonomous
Systems (AS) and router levels [2,3,4], and compare with
the results obtained by using targeted and random im- 
munization as well as a local immunization strategy. Our 
results indicate that the algorithm performs quite well and 
is near the optimal one. On the other hand, we show that 
the efficiency of the vaccination strongly depends on the 
degree-degree correlations as the covering outcome is
directly related to the structure of the underlying network.

Fig. 1. Correlations as a function of d for the AS and router
graph representations of the Internet. $\nu_d$ is the slope of
the curve $\langle k^{(d)} \rangle_k$, which measures the average
degree of neighbors at a distance d. See [21] for details of
this quantity.

2 Susceptible-Infected-Removed Model on Real Nets

In order to compare the efficiency of the different
immunization strategies, we first perform extensive
numerical simulations of an epidemic spreading process on
top of real architectures (here, epidemic refers to any
undesired spreading process, e.g., viruses, spam, etc.). We
consider the SIR model as a plausible model for epidemic
spreading [18,13]. In this model, nodes can be in three
different states. Susceptible nodes, S, have not been
infected and are healthy. They catch the disease via direct
contact with infected nodes, I, at a rate $\lambda$. Finally,
recovered (removed) nodes, R, are those nodes that have
caught the disease and have stopped spreading it, which
happens with probability $\beta$ (without loss of generality,
$\beta$ has been set to 1 henceforth). The relevant order
parameter of the disease dynamics is the total number of
nodes (or the fraction of them, R) that got infected once
the epidemic process dies out, i.e., when no infected nodes
are left in the system.
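
The dynamics just described can be sketched in a few lines. This is only an illustration under stated assumptions, not the simulation code behind the results: the graph is a plain adjacency dictionary, time advances in synchronous steps, and the names `sir_outbreak`, `adj`, `lam`, `seeds` and `immune` are hypothetical.

```python
import random


def sir_outbreak(adj, lam, seeds, immune=frozenset(), rng=None):
    """Run one SIR realization and return the set of removed nodes.

    adj    : dict mapping each node to a list of its neighbours (undirected)
    lam    : probability that a spreader infects a susceptible neighbour
    seeds  : initially infected nodes
    immune : nodes that can never be infected (empty without immunization)

    Assumption: beta = 1, so every infected node spreads for exactly one
    synchronous step and is then removed.
    """
    rng = rng or random.Random()
    infected = {s for s in seeds if s not in immune}
    removed = set()
    while infected:
        newly_infected = set()
        for i in infected:
            for j in adj[i]:
                # Only susceptible, non-immune neighbours can catch the disease.
                if j in immune or j in infected or j in removed or j in newly_infected:
                    continue
                if rng.random() < lam:
                    newly_infected.add(j)
        removed |= infected  # beta = 1: spreaders are removed after one step
        infected = newly_infected
    return removed


# Toy usage on a chain 0-1-2-3 (hypothetical data).
chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
incidence = len(sir_outbreak(chain, lam=1.0, seeds=[0])) / len(chain)
```

The epidemic incidence R is the size of the returned set divided by the number of nodes; averaging it over many seeds gives the kind of curve discussed in this section.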

On the other hand, the simulations performed throughout
this work have been carried out on real communication
networks. We made this choice because any study meant to
have practical applications should be tested in real
systems. These networks have unique topological properties
that are difficult to reproduce with existing generic
network models, namely, degree-degree correlations and
clustering properties. The networks on top of which
numerical simulations of the immunization strategies and
the SIR dynamics have been performed are the following.

AS: Autonomous system level graph representation of the
Internet as of April 16th, 2001.
Gnutella: Snapshot of the Gnutella peer-to-peer network,
provided by Clip2 Distributed Search Solutions.
Router: Router level graph representation of the
Internet [?].

The three networks are sparse and show an average degree
around 3. Additionally, they are small worlds [25] with an
average distance between vertices of less than 10, and they
are characterized by a power-law degree distribution
$P(k) \sim k^{-\gamma}$, with $\gamma \approx 2.2$. A detailed
characterization of these graphs is presented in Refs. [?]
(Gnutella) and [2,4,27] (AS and Router graphs).

Fig. 2. Final fraction of infected nodes for the SIR model
and targeted immunization with different numbers of
immunized nodes for the AS (a) and router (b) map
representations of the Internet. Simulations were carried
out starting from a single infected node at t = 0 in all
cases. The plots are in log-log scale for better
visualization.

These networks share a number of topological features 
but are radically different in their degree-degree correla- 
tions. Correlations are usually defined taking into account 
the degrees of nearest neighbors. We have recently shown
[21], however, that whether a network can be regarded as
assortative (when correlations are positive, i.e., there is a
tendency to establish connections between vertices with
similar degrees) or disassortative (negative correlations,
the tendency is the opposite) depends on the distance used
to average the degrees of the neighboring vertices. The AS
and the Gnutella graphs show disassortative correlations
for any value of d, though the correlations are smoothed as
d grows. On the other hand, in the Router network, the
degree correlations are assortative up to d = 2. However,
for d > 2 the correlations become disassortative and beyond
d > 6 start to approach the uncorrelated limit, as shown
in Fig. 1. These peculiar properties directly affect the
outcome of algorithms run on top of these networks.
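
One way to probe correlations at distance d is to average, for every degree class k, the degree of the nodes found exactly d steps away. The sketch below follows that idea; the precise quantity used in [21] may be defined differently, and the function names and the adjacency-dictionary representation are assumptions made for illustration.

```python
from collections import defaultdict, deque


def nodes_at_distance(adj, source, d):
    """Return the nodes whose shortest-path distance from source is exactly d."""
    dist = {source: 0}
    queue = deque([source])
    shell = []
    while queue:
        u = queue.popleft()
        if dist[u] == d:
            shell.append(u)
            continue  # nothing beyond distance d is needed
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return shell


def avg_degree_at_distance(adj, d):
    """Map each degree k to the mean degree seen at distance d from degree-k nodes."""
    per_k = defaultdict(list)
    for u in adj:
        shell = nodes_at_distance(adj, u, d)
        if shell:
            mean_deg = sum(len(adj[v]) for v in shell) / len(shell)
            per_k[len(adj[u])].append(mean_deg)
    return {k: sum(vals) / len(vals) for k, vals in per_k.items()}


# Toy usage: a star graph, where the hub only sees low-degree leaves.
star = {0: [1, 2, 3, 4], 1: [0], 2: [0], 3: [0], 4: [0]}
curve_d1 = avg_degree_at_distance(star, 1)
```

A curve that decreases with k points to disassortative behavior at that distance, and an increasing one to assortative behavior.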

In the following, we focus on the results obtained for 
the AS and router maps of the Internet. The behavior of 
both the epidemic spreading process and the immunization
strategies for the Gnutella graph is qualitatively the
same as for the AS map, the only difference being more
pronounced finite-size effects.

We have performed Monte Carlo simulations of the 
SIR model on top of the Internet maps. Starting from an 
initial state in which a randomly chosen node is infected, 
susceptible nodes catch the disease (or virus) with
probability $\lambda$ if they contact a spreader. In turn,
infected vertices become removed, and no longer take part
in the spreading process, at a rate $\beta = 1$. The fraction
of removed nodes, R, when no spreaders are left in the
system gives the epidemic incidence. All results have been
averaged over at least 1000 realizations corresponding to
different initially infected nodes. Figure 2 shows the
epidemic incidence in the AS and router maps of the Internet
as a function of the spreading rate $\lambda$.

As can be seen from the figure, the epidemic threshold
is slightly larger in the router graph than in the network
made up of ASs. This difference in the behavior of the SIR
model on different representations of the Internet may be
understood from the distinct degree-degree correlations
shown by the two graphs. Though we think that the main
differences in the algorithm's performance are due to
correlations, it should be noted that a number of other
topological features, such as clustering and hierarchy
properties, may also be at the root of the different
behaviors. Our guess is mainly based on the performance of
local algorithms such as the covering recipe that we will
use in the next section. As for correlations, in the AS map
representation, highly connected nodes are likely to be
connected to nodes with smaller degrees. Therefore, the
spreading process generally passes alternately from highly
to poorly connected nodes. In this way, the epidemic has
more chances to reach a number of nodes other than the
hubs. This is not the case for the Router map, where it is
more likely that hubs are grouped together and that, once
one of them gets infected, its neighbors (also highly
connected nodes) do so as well. However, when the epidemic
leaves the hubs, the remaining (uninfected) nodes are
likely to be poorly connected, and with high probability
the process will die out, especially for small values of
$\lambda \sim \lambda_c$. That is, in the router map, the
epidemic reaches the hubs, but then goes down to nodes of
decreasing degree and stops soon afterwards, resulting in a
smaller fraction of infected nodes (the hubs and a few
more, i.e., a tiny fraction of the network) and thus in an
effective threshold that is larger than that for the AS.

Fig. 3. Comparison of the immunization strategies for the
Internet AS map. In the figure, we have represented the
ratio between the epidemic incidence of the four
immunization strategies considered ($R$) and that of the
original system without immunization ($R_{SIR}$) for
different values of $\langle x \rangle$. The legend refers to
the following immunization strategies: the one introduced
in this paper (local), targeted immunization (Kmax), random
immunization (random) and single acquaintance immunization
(SAI). In this case, 1% of the non-immune nodes were
initially infected at random. See the text for further
details. The distances considered in the local algorithm
are: (a) d = 1, (b) d = 2, (c) d = 3, (d) d = 5.

In order to illustrate the importance of the local
properties of the network for the performance of the
immunization, we analyze the results when targeted
immunization is implemented on each representation of the
Internet. In targeted immunization, a fraction of highly
connected nodes are immunized (i.e., cannot get infected)
in decreasing order of their degrees. In the event that
there are l immune nodes left to be distributed within a
connectivity class k containing j > l nodes, the l immune
nodes are randomly distributed among the j nodes and the
results are averaged over at least 100 additional
realizations of this procedure. The results depicted in
Fig. 2 suggest that degree-degree correlations are one of
the main factors influencing the performance of the
immunization policy. We see that even for small percentages
of immune nodes, immunization performs better in the AS
graph. This may be due to the compact distribution of hubs
(which play a key role in targeted immunization) in the
router map, whereas in the AS representation they are
distributed throughout the whole network. Therefore, in the
AS representation, targeted immunization works better
because immune nodes are more efficient in cutting the
paths leading to poorly connected nodes, which are the more
abundant ones. These differences will become more apparent
later on when local immunization strategies come into play.
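
The targeted procedure, including the random assignment inside the last partially filled degree class, can be sketched as follows. The function name, its arguments and the adjacency-dictionary representation are illustrative assumptions, not the code used for the figures.

```python
import random
from collections import defaultdict


def targeted_immunization(adj, n_immune, rng=None):
    """Pick n_immune nodes in decreasing order of degree.

    When the last degree class holds more nodes than immune slots remain,
    the remaining slots are assigned at random within that class.
    """
    rng = rng or random.Random()
    by_degree = defaultdict(list)
    for node, nbrs in adj.items():
        by_degree[len(nbrs)].append(node)

    immune = set()
    for k in sorted(by_degree, reverse=True):
        remaining = n_immune - len(immune)
        if remaining <= 0:
            break
        degree_class = by_degree[k]
        if len(degree_class) <= remaining:
            immune.update(degree_class)  # the whole class fits
        else:
            immune.update(rng.sample(degree_class, remaining))  # random tie-break
    return immune


# Toy usage (hypothetical graph): node 0 is the only hub.
toy = {0: [1, 2, 3], 1: [0, 2], 2: [0, 1], 3: [0]}
immune_nodes = targeted_immunization(toy, n_immune=2)
```

Because the tie-break is random, repeated calls can return different sets, which is why the text averages over additional realizations of this step.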



3 Immunization Strategies 

Let us now summarize the local immunization strategy
introduced in this work. The allocation of network
resources to satisfy a given service with the least use of
resources is a frequent problem in communication networks.
In our case, we would like to have a system that is robust
against a disease or virus spreading process while saving
resources, that is, using the minimum number of immune
nodes. This is a highly topical problem in communication
networks, as it might lead to the development and
deployment of a digital immune system to protect
technological networks from virus spreading. Recently [21],
we have studied a general covering problem in which every
vertex is either covered or has at least one covered node
at a distance at most d. In what follows, we show that the
set of covered vertices C can be taken as the set of nodes
to be immunized.

Fig. 4. Same as Fig. 3 but for the Internet router map. The
distances considered in the local algorithm are in this
case: (a) d = 2, (b) d = 5, (c) d = 7, (d) d = 10.

The heuristic algorithm proceeds as follows [23]. For
every vertex i in the network, look for the vertex with
the highest degree within a distance d of i and immunize
it. In case there is more than one vertex with the highest
degree, one of them is selected at random and immunized.
Moreover, if there is already an immune vertex within the
neighborhood of i, that immunization is kept. We have shown
before [21] that this local algorithm gives near-optimal
solutions for a general distance-d covering problem, though
the result of the covering depends on topological features
such as the degree-degree correlations.
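
A minimal sketch of this heuristic is given below. It makes several assumptions that the text does not fix: vertices are visited in random order, the neighborhood of i includes i itself, and "that immunization is kept" is read as "no new node is immunized when one is already immune within distance d". All names are hypothetical.

```python
import random
from collections import deque


def ball(adj, source, d):
    """Return all nodes within distance d of source, source included."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if dist[u] == d:
            continue  # do not expand beyond distance d
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return set(dist)


def covering_immunization(adj, d, rng=None):
    """Choose immune nodes so that every vertex has one within distance d."""
    rng = rng or random.Random()
    order = list(adj)
    rng.shuffle(order)  # assumption: random visiting order
    immune = set()
    for i in order:
        neighbourhood = ball(adj, i, d)
        if neighbourhood & immune:
            continue  # already covered: keep the existing immunization
        kmax = max(len(adj[v]) for v in neighbourhood)
        candidates = [v for v in neighbourhood if len(adj[v]) == kmax]
        immune.add(rng.choice(candidates))  # random tie-break among top degrees
    return immune


# Toy usage: in a star graph the hub alone covers everything at d = 1.
star = {0: [1, 2, 3, 4], 1: [0], 2: [0], 3: [0], 4: [0]}
immune_nodes = covering_immunization(star, d=1)
```

Only the degrees inside each ball of radius d are inspected, which is what makes the procedure local; the fraction of immune nodes is an output of the run rather than an input.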

The immunization strategy considered here assumes that
covered vertices are immune to the spreading of a disease
or virus. For instance, in a technological network, they
could be thought of as special devices devoted to filtering
out any virus or attack. This would imply that the
spreading process stops when it arrives at such nodes. This
is of course the ideal situation. However, it happens more
often that immune nodes cannot catch the epidemic, but are
not able to stop it from spreading through other nodes, as
when one has an up-to-date anti-virus. Therefore, we study
the worse scenario and consider that immunized nodes just
repel the virus, cutting the path to infection spreading.

The approach presented here is in the spirit of the
immunization strategy proposed by Cohen et al. [22]. Since
the immunization algorithm is local, one only needs
information about the neighbors of a given node up to a
distance d. This information is usually available for small
values of d and easy to gather, in sharp contrast to
targeted immunization, which requires complete knowledge of
the degree distribution [19,20]. The difference between our
approach and that in [22] is that we look for the highly
connected nodes in small parts of the network, while the
strategy developed in [22] is based on the fact that
randomly selected acquaintances are likely to have larger
connectivities than randomly chosen nodes. Thus, in
general, we expect our strategy to perform better than that
proposed in [22], while keeping the local character of the
algorithm [28]. On the other hand, either the number of
immune nodes or the distance d, which is a measure of the
degree of local knowledge of the network topology, should
be fixed. This makes the algorithm more
parameter-constrained, but allows a more efficient
distribution of resources.
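
For comparison, the acquaintance idea of [22] can be sketched as: pick a node at random, then immunize one of its neighbors chosen at random. The variant below keeps drawing until a prescribed number of distinct nodes is immune, which is an assumption made here so that the count can be matched to another scheme; it is not claimed to be the exact procedure of [22], and all names are hypothetical.

```python
import random


def acquaintance_immunization(adj, n_immune, rng=None):
    """Immunize random neighbours of randomly chosen nodes.

    Sampling continues until n_immune distinct nodes are immune, or until
    every node that has at least one neighbour has been immunized.
    """
    rng = rng or random.Random()
    pickers = [u for u in adj if adj[u]]  # nodes that can point to a neighbour
    reachable = {v for u in pickers for v in adj[u]}
    target = min(n_immune, len(reachable))
    immune = set()
    while len(immune) < target:
        u = rng.choice(pickers)  # random node
        immune.add(rng.choice(list(adj[u])))  # one random acquaintance of u
    return immune


# Toy usage: leaves of a star point to the hub, so the hub is picked often.
star = {0: [1, 2, 3, 4], 1: [0], 2: [0], 3: [0], 4: [0]}
immune_nodes = acquaintance_immunization(star, n_immune=2)
```

A node of degree k is reached through any of its k neighbors, so high-degree nodes are selected more often than under uniform random immunization, without any knowledge of the degrees.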



We have performed extensive numerical simulations of
four different immunization schemes. The immunization
obtained following the covering algorithm fixes the
fraction, $\langle x \rangle$, of immune nodes in the whole
network for each value of d. Random immunization means that
a fraction $\langle x \rangle$ of immune nodes is randomly
placed on the network. Targeted immunization looks for the
$\langle x \rangle N$ most highly connected nodes and
immunizes them. Finally, the Single Acquaintance
Immunization (SAI) algorithm proposed in [22] is run taking
$p = \langle x \rangle$ and ensuring that the total number of
immune nodes is the same in both schemes. In all cases, the
results are averaged over many realizations for each value
of $\lambda$ and $\langle x \rangle$. The results are displayed
in Figs. 3 and 4.

As expected, targeted immunization produces the best
results for both topologies. Note that, as discussed in the
previous section, the performance of the algorithm depends
on the specific topology and produces different results for
the AS and router maps. At the other extreme we find random
immunization, whose performance is not affected by the
structure of the underlying networks. Turning our attention
to local algorithms, it is found that the immunization
scheme based on the covering algorithm performs better than
the SAI, even for small values of d, where it is truly
local. In fact, it is outperformed only by the targeted
procedure, and for all values of the parameters d and
$\lambda$ it lies between the most efficient scheme and the
SAI scheme. Additionally, from a practical point of view,
the covering strategy could be a good policy since it
balances the degree of local knowledge and the efficiency
of the vaccination. Moreover, as network topologies are
neither completely known nor completely unknown, the
covering allows one to fine-tune the value of d on a
case-by-case basis (that is, according to the degree of
local knowledge of the network) and thus it is more
flexible than other immunization strategies (recall that it
is the result of an optimization).

We have further explored the differences between the
global and covering-based immunization schemes. In
principle, one may think that, as we are immunizing highly
connected nodes, both strategies produce the same set of
immune nodes. This is not the case, since the covering
operates at shorter distances than targeted immunization
(which operates at d = D, the diameter of the network). In
fact, a direct comparison of the immune nodes in the two
algorithms shows that no more than 50% of them are the
same, and the two sets coincide only when d reaches the
diameter of the network. Moreover, as further evidence of
the influence of the graph representation on the
performance of immunization schemes, it is found that for
the router level the percentage above can increase up to
70%.
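
The comparison of the two immune sets amounts to a set intersection. The small helper below is only a sketch; normalizing by the size of the covering-based set is an assumption, since the text does not state which set is used as the reference (the two have equal size when the targeted scheme is run with the same number of immune nodes).

```python
def overlap_fraction(covering_set, targeted_set):
    """Fraction of the covering-based immune nodes that targeted immunization also picks."""
    covering_set = set(covering_set)
    if not covering_set:
        return 0.0
    return len(covering_set & set(targeted_set)) / len(covering_set)


# Toy usage with hypothetical node labels.
shared = overlap_fraction({1, 2, 3, 4}, {3, 4, 5, 6})
```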

Let us now restrict our discussion to the local (covering)
immunization scheme and focus on the influence of
degree-degree correlations on the final size of the
outbreak. Figures 3 and 4 reflect the differences in the
algorithm's performance for the AS and the Router maps of
the Internet. Figure 5 illustrates the relative difference
of the epidemic incidence as a function of d, taking as a
reference the size of the outbreak at d = 1. The behavior
depicted in the figure is quite similar to the dependence
of the number of nodes covered by each immune node,
$\langle n \rangle$, when d is increased. For the AS network,
the fraction of infected nodes at the end of the epidemic
spreading process rapidly increases. In contrast, the
increase in the epidemic incidence for the router network
takes place at larger values of d. This indicates that for
the same d > 1, the immunization strategy works better at
the router level, as confirmed in the top panel of Fig. 6.
The reason for this behavior becomes apparent by noticing
that for the router level $\langle x \rangle$ is bigger than
for the AS, but the number $\langle n \rangle$ of nodes
"covered" on average by each immune node is smaller. The
combination of the two factors leads to a more efficient
immunization at the router level, at the cost, however, of
more resources. Both strategies tend to be closer as d is
increased because at the router level the correlations
change beyond d > 2.

Fig. 5. Relative difference of the epidemic incidence for
different values of d with respect to that at d = 1
($\lambda = 1$). The behavior observed in the figure is
determined by the number of susceptible nodes each immune
vertex has to "protect". See the text for further details.

The previous result has to be interpreted carefully. A
closer look at the influence of the correlations reveals
that, although in general they determine $\langle x \rangle$
and $\langle n \rangle$ for each map, these two quantities
alone do not suffice to explain all the differences
observed. Indeed, the local structure of the network turns
out to be at the root of the immunization efficiency and of
the optimal trade-off between the size of the outbreak and
the least use of resources. To see this, we have analyzed
the situation in which both $\langle x \rangle$ (though the
d's are different) and $\langle n \rangle$ are almost the
same in the two representations. This case is represented
in the bottom panel of Fig. 6. As can be seen from the
figure, in the latter case, the immunization scheme for the
AS outperforms that for the router level. This behavior is
due to the fact that in the AS network, the immune nodes
are more widely distributed throughout the network because
highly connected vertices alternate with poorly connected
ones. On the contrary, at the router level, the hubs are
topologically closer to each other (the correlations are
positive) and thus some of the immune nodes are not highly
connected, resulting in a less efficient protection against
an epidemic.

Fig. 6. Top: Phase transition is revealed by the best
performance in Routers when $\langle n \rangle$ is bigger for
AS (d > 1). Bottom: On the contrary, when the number of
nodes covered by each immune vertex, $\langle n \rangle$, is
(roughly) the same, immunization works better in AS
($\langle n \rangle = 0$ for d = 1; $\langle n \rangle \sim 0.2$
for d = 2 in AS and d = 5 in Routers; $\langle n \rangle = 1$
for d = 6 in AS and d = 15 in Routers). The results were
obtained starting from a randomly chosen infected node and
setting $\lambda = 1$.

4 Discussion and Conclusions 

In this paper, we have analyzed the spreading of an epi- 
demic disease on top of real communication networks both 
with and without immunization. First, we have shown that 
targeted immunization produces different results depend- 
ing on the local properties of the underlying graph by 
using different representations of the same technological 
network, the Internet. Later, we turned our attention to 
several immunization strategies and proposed a scheme 
that is neither completely local nor global, but can be 
tuned between the two extremes. The strategy introduced 
has been shown to perform better than all previous meth- 
ods irrespective of the degree of local knowledge, except 
for the case of targeted immunization. 

An important part of the work has dealt with the in- 
fluence of degree-degree correlations on the performance 
of all vaccination algorithms. In this respect, it has been
shown that local properties are extremely important for
the outcome of a given strategy. Moreover, the work
presented here has been performed on top of real networks,
and thus the results are of high practical interest. An
added value of the method developed here is that the
covering-based strategy does not only deal with the degree
of the immune nodes, as targeted immunization does, but
naturally introduces the practical constraint of having
limited resources to be distributed in the system on top of
which the epidemic is spreading. Therefore, our method and
the results found can shed light and provide useful hints
in the search for optimal immunization strategies, such as
the development and deployment of a digital immune system,
a highly topical issue nowadays.

Finally, it is worth mentioning that although we have 
not analyzed the case here, it would also be possible to 
develop an even more flexible strategy in which the im- 
munization through the covering algorithm is done with 
a variable d for the same network, that is, one can
implement an algorithm that optimizes $\langle x \rangle$
locally for different neighborhoods (i.e., different values
of d for each neighborhood) of a given (large) network.

In summary, our work points to a new direction in
designing immunization strategies, namely, finding a better
trade-off between resources and the algorithm's
performance.

[28] Strictly speaking, our algorithm is neither completely
local nor global. In fact, by tuning the distance d of the
immunization (covering) strategy one can move from a truly
local algorithm to an algorithm close to the targeted
immunization approach for $d \sim D$, D being the diameter of
the network. In this sense, our method is halfway between
strictly local and global strategies. This difference fades
when one considers ultra-small world networks, which is not
our case.



